The Derivation of a Grammatically Indexed Lexicon from the Longman Dictionary of Contemporary English
نویسندگان
چکیده
We describe a methodology and associated software system for the construction of a large lexicon from an existing machine-readable (published) dictionary. The lexicon serves as a component of an English morphological and syntactic analyesr and contains entries with grammatical definitions compatible with the word and sentence grammar employed by the analyser. We describe a software system with two integrated components. One of these is capable of extracting syntactically rich, theory-neutral lexical templates from a suitable machine-readabh source. The second supports interactive and semi-automatic generation and testing of target lexical entries in order to derive a sizeable, accurate and consistent lexicon from the source dictionary which contains partial (and occasionally inaccurate) information. Finally, we evaluate the utility of the Longman Dictionary of Contemporary EnglgsA as a suitable source dictionary for the target lexicon.
منابع مشابه
Large Lexicons for Natural Language Processing: Utilising the Grammar Coding System of LDOCE
This article focusses on the derivation of large lexicons for natural language processing. We describe the development of a dictionary support environment linking a restructured version of the Longman Dictionary of Contemporary English to natural language processing systems. The process of restructuring the information in the machine readable version of the dictionary is discussed. The Longman ...
متن کاملAcquisition Of Computational-Semantic Lexicons From Machine Readable Lexical Resources
This paper describes a heuristic algorithm capable of automatically assigning a label to each of the senses in a machine readable dictionary (MRD) for the purpose of acquiring a computational-semantic lexicon for treatment of lexical ambiguity. Including these labels in the MRD-based lexical database offers several positive effects. The labels can be used as a coarser sense division so unnecess...
متن کاملA Class-based Approach to Word Alignment
This paper presents an algorithm capable of identifying the translation for each word in a bilingual corpus. Previously proposed methods rely heavily on word-based statistics. Under a word-based approach, frequent words with a consistent translation can be aligned at a high rate of precision. However, words that are less frequent or exhibit diverse translations generally do not have statistical...
متن کاملCombining Machine Readable Lexical Resources and Bilingual Corpora for Broad Word Sense Disambiguation
This paper describes a new approach to word sense disambiguation (WSD) based on automatically acquired "word sense division. The semantically related sense entries in a bilingual dictionary are arranged in clusters using a heuristic labeling algorithm to provide a more complete and appropriate sense division for WSD. Multiple translations of senses serve as outside information for automatic tag...
متن کاملClass Based Sense Definition Model for Word Sense Tagging and Disambiguation
We present an unsupervised learning strategy for word sense disambiguation (WSD) that exploits multiple linguistic resources including a parallel corpus, a bilingual machine readable dictionary, and a thesaurus. The approach is based on Class Based Sense Definition Model (CBSDM) that generates the glosses and translations for a class of word senses. The model can be applied to resolve sense amb...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1987